Preliminaries

Load packages


Load the Moving to Opportunities neighborhoods dataset (Chetty and Hendren, 2018) and remove missing entries

nbhood_csv
Row  CZ     Name_of_CZ          Census_2000_population  State_Abbrev.  State             Urban_Areas  p25_coef  p25_se   p25_se_boot
     Int64  String              Float64                 String         String            Int64        Float64   Float64  Float64
1    100    "Johnson City"      576081.0                "TN"           "Tennessee"       1            -0.416    0.25     0.273
2    200    "Morristown"        227816.0                "TN"           "Tennessee"       1            -0.043    0.249    0.266
3    301    "Middlesborough"    66708.0                 "TN"           "Tennessee"       0            0.428     0.568    0.578
4    302    "Knoxville"         727600.0                "TN"           "Tennessee"       1            -0.055    0.183    0.187
5    401    "Winston-Salem"     493180.0                "NC"           "North Carolina"  1            -0.443    0.199    0.201
6    402    "Martinsville"      92753.0                 "VA"           "Virginia"        0            -0.47     0.413    0.411
7    500    "Greensboro"        1.05513e6               "NC"           "North Carolina"  1            -0.347    0.132    0.134
8    601    "North Wilkesboro"  90016.0                 "NC"           "North Carolina"  0            0.237     0.572    0.519
9    602    "Galax"             64676.0                 "VA"           "Virginia"        0            -1.085    0.692    0.806
10   700    "Spartanburg"       354533.0                "SC"           "South Carolina"  1            -0.233    0.23     0.243
⋮
595  39400  "Seattle"           3.77574e6               "WA"           "Washington"      1            0.229     0.082    0.089

Wrap the data in a type that represents the Gaussian likelihood:
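
As a rough illustration of what such a wrapper does, the sketch below pairs each estimate with its standard error and exposes the Gaussian likelihood Z | μ ~ N(μ, σ²). The type `GaussianObs` and the function `likelihood` are hypothetical stand-ins written for this example; they are not the package's own definitions. The values are the `p25_coef` and `p25_se` entries of the first three rows shown above.

```julia
# Hypothetical stand-in for a Gaussian-likelihood sample type (not the package's type).
struct GaussianObs
    z::Float64   # observed estimate (here: p25_coef)
    σ::Float64   # its standard error (here: p25_se)
end

# Density of the observation z given the true effect μ, i.e. Z | μ ~ N(μ, σ²)
likelihood(obs::GaussianObs, μ::Real) =
    exp(-0.5 * ((obs.z - μ) / obs.σ)^2) / (obs.σ * sqrt(2π))

# First three rows of the dataset: (p25_coef, p25_se)
coefs = [-0.416, -0.043, 0.428]
ses   = [0.25, 0.249, 0.568]
obs   = GaussianObs.(coefs, ses)   # broadcast the constructor over both columns
```

Keeping the standard error next to each estimate matters here because the neighborhoods have very different precisions, so every observation carries its own likelihood.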

Zs

Let us define the empirical Bayes targets we want to estimate. We will estimate $E[\mu \mid Z=z, \sigma]$, where we vary $\sigma \in \{0.5, 1, 2\}$ and $z \in \{-3\sigma, -2.9\sigma, \dotsc, 2.9\sigma, 3\sigma\}$.
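
The evaluation grid can be sketched in plain Julia as follows. This only builds the (z, σ) pairs; wrapping each pair in a posterior-mean target is left to the package, and the variable name `zs` is an assumption of this example.

```julia
σs = [0.5 1.0 2.0]            # 1×3 row of standard errors
zs = collect(-3:0.1:3) .* σs  # 61×3: column j holds -3σ, -2.9σ, …, 3σ for σ = σs[j]
size(zs)                      # (61, 3)
```

Scaling the z grid by σ keeps every column on the same range in units of standard errors, which is why the first column runs from -1.5 to 1.5 and the last from -6.0 to 6.0.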

σs
1×3 Array{Float64,2}:
 0.5  1.0  2.0
targets
61×3 Array{PosteriorMean{NormalSample{Float64,Float64}},2}:
 PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.5 | σ=0.5  )  …  PosteriorMean{NormalSample{Float64,Float64}}(Z=    -6.0 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.45 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.8 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.4 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.6 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.35 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.4 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.3 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.2 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.25 | σ=0.5  )  …  PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.0 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.2 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=    -4.8 | σ=2.0  )
 ⋮                                                                   ⋱  
 PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.25 | σ=0.5  )  …  PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.0 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.3 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.2 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.35 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.4 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.4 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.6 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.45 | σ=0.5  )     PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.8 | σ=2.0  )
 PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.5 | σ=0.5  )  …  PosteriorMean{NormalSample{Float64,Float64}}(Z=     6.0 | σ=2.0  )
target_names
1×3 Array{LaTeXString,2}:
 L"$E[\mu \mid Z=z, \sigma=0.5]$"  …  L"$E[\mu \mid Z=z, \sigma=2.0]$"

Smooth prior class

In this section we first assume that the true prior $G$ is smooth and lies in the class of mixtures of

$\{N(\mu, 0.25^2) : \mu \in \{-3, -2.99, \dotsc, 3\}\}$.

gcal_smooth
MixturePriorClass (K = 601)
Distributions.Normal{Float64}(μ=-3.0, σ=0.25)
Distributions.Normal{Float64}(μ=-2.99, σ=0.25)
Distributions.Normal{Float64}(μ=-2.98, σ=0.25)
Distributions.Normal{Float64}(μ=-2.97, σ=0.25)
Distributions.Normal{Float64}(μ=-2.96, σ=0.25)
Distributions.Normal{Float64}(μ=-2.95, σ=0.25)
Distributions.Normal{Float64}(μ=-2.94, σ=0.25)
Distributions.Normal{Float64}(μ=-2.93, σ=0.25)
The rest are omitted ...
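
Written out, a prior in this class and the marginal distribution it induces for an observation with standard error σ take the form below. The weights π_j and the indexing μ_j = -3 + 0.01(j - 1) are notation introduced for this illustration, read off the K = 601 components printed above.

```latex
G = \sum_{j=1}^{K} \pi_j \, N(\mu_j, 0.25^2),
\qquad \pi_j \ge 0, \quad \sum_{j=1}^{K} \pi_j = 1, \quad K = 601,
\\[4pt]
Z \mid \sigma \;\sim\; \sum_{j=1}^{K} \pi_j \, N\!\left(\mu_j,\; 0.25^2 + \sigma^2\right).
```

The second line follows because a normal prior component with variance 0.25² convolved with Gaussian noise of variance σ² is again normal, with the two variances added.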

We start by fitting the NPMLE to that prior class (this is not needed for the intervals).
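
The notebook's solver is not shown. As a minimal stand-in, the sketch below fits the mixing weights over a fixed grid by EM, which is one standard way to approximate an NPMLE; it should not be read as the method the package uses. The function names `normpdf` and `npmle_em`, the coarse grid, and the iteration count are assumptions of this example. The data are the `p25_coef` and `p25_se` entries of the first five rows of the dataset.

```julia
# Normal density with mean m and standard deviation s
normpdf(x, m, s) = exp(-0.5 * ((x - m) / s)^2) / (s * sqrt(2π))

# EM for the mixing weights over a fixed grid of component means.
# τ is the component standard deviation (0.25 for the smooth class).
function npmle_em(zs, σs, grid; τ = 0.25, iters = 200)
    n, K = length(zs), length(grid)
    # L[i, j] = marginal density of z_i under component j: N(μ_j, τ² + σ_i²)
    L = [normpdf(zs[i], grid[j], sqrt(τ^2 + σs[i]^2)) for i in 1:n, j in 1:K]
    w = fill(1 / K, K)                  # start from uniform weights
    for _ in 1:iters
        R = L .* w'                     # E-step: unnormalized responsibilities
        R ./= sum(R, dims = 2)          # normalize each row to sum to one
        w = vec(sum(R, dims = 1)) ./ n  # M-step: average responsibility per component
    end
    return w
end

coefs = [-0.416, -0.043, 0.428, -0.055, -0.443]  # p25_coef, rows 1 to 5
ses   = [0.25, 0.249, 0.568, 0.183, 0.199]       # p25_se, rows 1 to 5
w     = npmle_em(coefs, ses, -3:0.5:3)           # weights on a coarse 13-point grid
```

Each EM step keeps the weights nonnegative and summing to one, so no explicit constraint handling is needed; a convex solver would typically be faster on the full 601-point grid.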


Next we construct the confidence intervals (compound DKW-F-localization):
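
For intuition about the localization step, the sketch below computes the classical Dvoretzky-Kiefer-Wolfowitz (DKW) band for i.i.d. data. The compound variant used here has to handle observations with different standard errors, and its exact form is not shown in this notebook, so treat this as the textbook version only. The helper names and the sample size n = 595 (the last row index displayed in the dataset) are assumptions of this example.

```julia
# Classical DKW half-width: with probability at least 1 - α,
# sup_t |F̂_n(t) - F(t)| ≤ sqrt(log(2/α) / (2n)).
dkw_halfwidth(n; α = 0.05) = sqrt(log(2 / α) / (2n))

# Empirical CDF of the sample xs evaluated at t
ecdf_at(xs, t) = count(x -> x <= t, xs) / length(xs)

# Simultaneous band for F(t), clipped to [0, 1]
function dkw_band(xs, t; α = 0.05)
    ε = dkw_halfwidth(length(xs); α = α)
    F̂ = ecdf_at(xs, t)
    return (max(F̂ - ε, 0.0), min(F̂ + ε, 1.0))
end

n = 595                 # assumed sample size
ε = dkw_halfwidth(n)    # band half-width at α = 0.05
```

F-localization then restricts attention to priors whose implied marginal distribution stays inside such a band, and the interval for a target is the range of that target over those priors.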

floc_method_smooth
postmean_cis_smooth
61×3 Array{Empirikos.LowerUpperConfidenceInterval,2}:
 lower = -1.244, upper = -0.2099, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.5 | σ=0.5  ))   …  lower = -1.587, upper = 0.01869, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -6.0 | σ=2.0  ))
 lower = -1.113, upper = -0.1999, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.45 | σ=0.5  ))      lower = -1.465, upper = 0.02154, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.8 | σ=2.0  ))
 lower = -0.9878, upper = -0.1899, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.4 | σ=0.5  ))     lower = -1.344, upper = 0.02484, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.6 | σ=2.0  ))
 lower = -0.8702, upper = -0.1799, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.35 | σ=0.5  ))     lower = -1.224, upper = 0.02792, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.4 | σ=2.0  ))
 lower = -0.7615, upper = -0.1699, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.3 | σ=0.5  ))     lower = -1.108, upper = 0.031, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.2 | σ=2.0  ))
 lower = -0.6629, upper = -0.1591, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.25 | σ=0.5  ))  …  lower = -0.9958, upper = 0.03407, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.0 | σ=2.0  ))
 lower = -0.5747, upper = -0.144, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.2 | σ=0.5  ))      lower = -0.889, upper = 0.03715, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -4.8 | σ=2.0  ))
 ⋮                                                                                                                 ⋱  
 lower = 0.304, upper = 0.9541, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.25 | σ=0.5  ))     …  lower = 0.1529, upper = 1.584, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.0 | σ=2.0  ))
 lower = 0.3213, upper = 1.049, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.3 | σ=0.5  ))        lower = 0.156, upper = 1.691, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.2 | σ=2.0  ))
 lower = 0.3318, upper = 1.142, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.35 | σ=0.5  ))        lower = 0.1591, upper = 1.797, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.4 | σ=2.0  ))
 lower = 0.3418, upper = 1.24, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.4 | σ=0.5  ))         lower = 0.1621, upper = 1.902, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.6 | σ=2.0  ))
 lower = 0.3518, upper = 1.337, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.45 | σ=0.5  ))        lower = 0.1654, upper = 2.004, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.8 | σ=2.0  ))
 lower = 0.3618, upper = 1.429, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.5 | σ=0.5  ))     …  lower = 0.1683, upper = 2.102, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     6.0 | σ=2.0  ))

Also evaluate the plug-in estimates based on the NPMLE and finally plot everything (interval bands in blue, NPMLE estimate in dashed purple, and the identity z ↦ z in dotted black).


Discrete class

Now we instead make a more standard nonparametric EB assumption, namely that $G$ is an arbitrary discrete distribution supported on the grid $\{-3, -2.99, \dotsc, 2.99, 3\}$, and repeat the same steps as for the smooth class above.

discrete_class
DiscretePriorClass | support = -3.0:0.01:3.0
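
Under a discrete prior the posterior mean has a closed form: a likelihood-weighted average of the support points. The sketch below is a plain-Julia illustration with uniform weights chosen only for the example; `ϕ` and `posterior_mean` are hypothetical helper names, not package functions.

```julia
# Standard normal density
ϕ(x) = exp(-0.5 * x^2) / sqrt(2π)

# Posterior mean E[μ | Z = z, σ] when the prior puts weight w[j] on atom μs[j]
function posterior_mean(z, σ, μs, w)
    lik = ϕ.((z .- μs) ./ σ)   # likelihood of z at each atom (the common 1/σ factor cancels)
    return sum(w .* lik .* μs) / sum(w .* lik)
end

μs = collect(-3.0:0.01:3.0)             # same support as the discrete class above
w  = fill(1 / length(μs), length(μs))   # uniform weights, for illustration only
posterior_mean(0.0, 0.5, μs, w)         # symmetric prior and z = 0, so approximately 0
```

A plug-in estimate replaces `w` with fitted weights, whereas the confidence intervals bound this ratio over all weight vectors compatible with the data.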
floc_method_discrete
postmean_cis_discrete
61×3 Array{Empirikos.LowerUpperConfidenceInterval,2}:
 lower = -1.709, upper = 0.09511, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.5 | σ=0.5  ))  …  lower = -1.96, upper = 0.09678, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -6.0 | σ=2.0  ))
 lower = -1.562, upper = 0.09512, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.45 | σ=0.5  ))     lower = -1.851, upper = 0.09713, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.8 | σ=2.0  ))
 lower = -1.413, upper = 0.09512, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.4 | σ=0.5  ))     lower = -1.737, upper = 0.09728, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.6 | σ=2.0  ))
 lower = -1.265, upper = 0.09513, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.35 | σ=0.5  ))     lower = -1.62, upper = 0.09766, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.4 | σ=2.0  ))
 lower = -1.119, upper = 0.09513, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.3 | σ=0.5  ))     lower = -1.5, upper = 0.09824, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.2 | σ=2.0  ))
 lower = -0.977, upper = 0.1022, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=   -1.25 | σ=0.5  ))   …  lower = -1.38, upper = 0.09901, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -5.0 | σ=2.0  ))
 lower = -0.8433, upper = 0.1089, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -1.2 | σ=0.5  ))     lower = -1.26, upper = 0.1, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    -4.8 | σ=2.0  ))
 ⋮                                                                                                                ⋱  
 lower = 0.01533, upper = 1.175, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.25 | σ=0.5  ))   …  lower = 0.08832, upper = 1.598, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.0 | σ=2.0  ))
 lower = 0.02487, upper = 1.283, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.3 | σ=0.5  ))      lower = 0.09044, upper = 1.708, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.2 | σ=2.0  ))
 lower = 0.04172, upper = 1.382, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.35 | σ=0.5  ))      lower = 0.09243, upper = 1.816, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.4 | σ=2.0  ))
 lower = 0.06855, upper = 1.496, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.4 | σ=0.5  ))      lower = 0.09438, upper = 1.921, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.6 | σ=2.0  ))
 lower = 0.09328, upper = 1.619, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=    1.45 | σ=0.5  ))      lower = 0.09638, upper = 2.022, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     5.8 | σ=2.0  ))
 lower = 0.1057, upper = 1.731, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     1.5 | σ=0.5  ))    …  lower = 0.09841, upper = 2.117, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=     6.0 | σ=2.0  ))
targets_discrete_npmle
61×3 Array{Float64,2}:
 -0.0661083  -0.0313622  -0.00599918
 -0.0639283  -0.0300027  -0.00496677
 -0.0617293  -0.0286258  -0.00391495
 -0.0595121  -0.02723    -0.00284286
 -0.0572774  -0.025814   -0.00174962
 -0.0550258  -0.0243759  -0.000634291
 -0.0527579  -0.022914    0.000504118
  ⋮                      
  0.410241    0.254003    0.139249
  0.448052    0.27757     0.147126
  0.486403    0.304242    0.155606
  0.524787    0.334624    0.164754
  0.562703    0.369437    0.174644
  0.599696    0.409529    0.185357
1×3 Array{Array{Float64,1},2}:
 [-1.5, 1.5]  [-3.0, 3.0]  [-6.0, 6.0]

AMARI intervals

Here we also show how the AMARI intervals may be computed. We assume that the prior lies in the discrete prior class and also compare AMARI intervals to the compound-F-localization intervals (for the posterior mean of the neighborhood causal effect of Yuma).

yuma_df
Row  CZ     Name_of_CZ          Census_2000_population  State_Abbrev.  State             Urban_Areas  p25_coef  p25_se   p25_se_boot
     Int64  String              Float64                 String         String            Int64        Float64   Float64  Float64
1    38100  "Yuma"              302387.0                "CA"           "California"      1            -0.216    0.172    0.175
yuma_posterior_mean

Let us first look at the plug-in estimate from the NPMLE.

-0.0641787365657269

Next we compute the compound-F-localization interval:

ci_floc
lower = -0.2179, upper = 0.1232, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=  -0.216 | σ=0.172))

And finally the AMARI interval.

amari
ci_amari
lower = -0.1011, upper = -0.02174, α = 0.05  (PosteriorMean{NormalSample{Float64,Float64}}(Z=  -0.216 | σ=0.172))

The AMARI interval does not contain zero (so we can reject the null hypothesis that the posterior mean for Yuma is nonnegative), while the compound-F-localization interval does contain zero.
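
The comparison can be checked directly from the two intervals reported above. The named tuples and the helpers `contains_zero` and `width` are hypothetical conveniences for this example; the endpoints are copied from the printed output.

```julia
# Reported 95% intervals for Yuma's posterior mean
ci_floc  = (lower = -0.2179, upper = 0.1232)
ci_amari = (lower = -0.1011, upper = -0.02174)

contains_zero(ci) = ci.lower <= 0 <= ci.upper
width(ci) = ci.upper - ci.lower

contains_zero(ci_floc)             # true
contains_zero(ci_amari)            # false
width(ci_floc) / width(ci_amari)   # the F-localization interval is roughly 4.3 times wider
```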
